
feat!: migrate to scrapegraph-py v2 API surface #1058

Open: VinciGit00 wants to merge 2 commits into main from feat/migrate-to-scrapegraph-py-v2

VinciGit00 (Member) commented Mar 31, 2026

Summary

  • Migrate all scrapegraph-py SDK usage to the new v2 API surface (see scrapegraph-py#82, "feat!: migrate Python SDK to v2 API surface")
  • Bump dependency from scrapegraph-py>=1.44.0 to >=2.0.0
  • Update core integration in SmartScraperGraph and all 3 example scripts
  • Pass output_schema to extract() so Pydantic schemas are forwarded to the v2 API
  • Use context manager pattern (with Client(...) as client) for proper resource cleanup
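To make the last two bullets concrete, here is a minimal sketch of the call shape using a stand-in client. `StubClient` is hypothetical and performs no network I/O; only the `extract(url=, prompt=, output_schema=)` keywords and the `with ... as client` pattern come from this PR. The real v2 `Client` may differ in constructor arguments and return types.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class StubClient:
    """Hypothetical stand-in for the v2 SDK client; it makes no HTTP requests."""

    api_key: str
    closed: bool = False
    calls: list = field(default_factory=list)

    def __enter__(self) -> "StubClient":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Cleanup runs on normal exit and also when the body raises.
        self.closed = True
        return False  # do not swallow exceptions

    def extract(self, *, url: str, prompt: str, output_schema: Any = None) -> dict:
        # Record the keyword arguments so the call shape can be inspected.
        self.calls.append(
            {"url": url, "prompt": prompt, "output_schema": output_schema}
        )
        return {"result": None}


# A plain dict schema; the PR states that a Pydantic BaseModel is accepted too.
schema = {"type": "object", "properties": {"title": {"type": "string"}}}

with StubClient(api_key="sgai-placeholder") as client:
    client.extract(
        url="https://example.com",
        prompt="Extract the page title",
        output_schema=schema,
    )

print(client.closed)  # True: __exit__ ran when the with-block ended
print(client.calls[0]["output_schema"] is schema)  # True: schema passed through
```

The point of the `with` form is that cleanup no longer depends on the caller remembering to close the client, including on error paths.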

API mapping

| v1 Method | v2 Method | Endpoint |
|---|---|---|
| smartscraper(website_url=, user_prompt=) | extract(url=, prompt=, output_schema=) | POST /api/v2/extract |
| searchscraper(user_prompt=) | search(query=) | POST /api/v2/search |
| markdownify(website_url=) | scrape(url=) | POST /api/v2/scrape |
| get_credits() | credits() | GET /api/v2/credits |
| generate_schema() | (removed) | |
| crawl() / get_crawl() | crawl.start() / crawl.status() / .stop() / .resume() | /api/v2/crawl |
| scheduled jobs | monitor.create() / .list() / .pause() / .resume() / .delete() | /api/v2/monitor |
| | history() | GET /api/v2/history |
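For callers migrating their own code, the one-to-one rows of the mapping can be expressed as a small lookup. This is an illustrative sketch only: `V1_TO_V2` and `translate_call` are hypothetical helper names, not part of scrapegraph-py, and passing unlisted keywords through unchanged is an assumption.

```python
# v1 method name -> (v2 method name, {v1 keyword: v2 keyword}).
# Only the one-to-one rows of the mapping are covered here.
V1_TO_V2 = {
    "smartscraper": ("extract", {"website_url": "url", "user_prompt": "prompt"}),
    "searchscraper": ("search", {"user_prompt": "query"}),
    "markdownify": ("scrape", {"website_url": "url"}),
    "get_credits": ("credits", {}),
}

# Methods listed as removed, with no v2 replacement.
REMOVED_IN_V2 = {"generate_schema"}


def translate_call(v1_method: str, kwargs: dict) -> tuple:
    """Return (v2_method_name, renamed_kwargs) for a v1-style call."""
    if v1_method in REMOVED_IN_V2:
        raise NotImplementedError(f"{v1_method}() has no v2 equivalent")
    if v1_method not in V1_TO_V2:
        raise KeyError(f"no one-to-one mapping recorded for {v1_method}()")
    v2_method, renames = V1_TO_V2[v1_method]
    # Assumption: keywords without a recorded rename are passed through as-is.
    new_kwargs = {renames.get(key, key): value for key, value in kwargs.items()}
    return v2_method, new_kwargs


print(translate_call("smartscraper", {"website_url": "https://example.com",
                                      "user_prompt": "Extract the title"}))
# ('extract', {'url': 'https://example.com', 'prompt': 'Extract the title'})
```

The crawl and monitor rows are left out on purpose, since they split one v1 entry point into several v2 methods and cannot be renamed mechanically.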

Other v2 changes (from scrapegraph-py)

  • Auth now sends both Authorization: Bearer and SGAI-APIKEY headers
  • New shared models: FetchConfig (with FetchMode enum: auto/fast/js/direct+stealth/js+stealth), LlmConfig
  • scrape() supports format: markdown, html, screenshot, branding
  • extract() and search() accept output_schema (dict or Pydantic BaseModel)
  • Context manager support (with Client(...) as client:)
  • Removed: markdownify, agenticscraper, sitemap, healthz, feedback, all scheduled job methods
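Two of the items above lend themselves to a short illustration: the dual auth headers and the fetch modes. The sketch below is not the SDK's code. `build_auth_headers` and `FetchModeSketch` are hypothetical names, the enum member names are invented, and sending the raw key as the `SGAI-APIKEY` value is an assumption. Only the header names and the five mode strings come from this PR.

```python
from enum import Enum


class FetchModeSketch(str, Enum):
    # Member names are hypothetical; only the string values appear in the PR.
    AUTO = "auto"
    FAST = "fast"
    JS = "js"
    DIRECT_STEALTH = "direct+stealth"
    JS_STEALTH = "js+stealth"


def build_auth_headers(api_key: str) -> dict:
    """Build both auth headers named in the PR for a single request."""
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumption: this header carries the raw key without a scheme prefix.
        "SGAI-APIKEY": api_key,
    }


headers = build_auth_headers("sgai-placeholder")
print(sorted(headers))  # ['Authorization', 'SGAI-APIKEY']
print(FetchModeSketch("js+stealth").name)  # JS_STEALTH
```

Looking a mode up by its string value, as in the last line, rejects typos early with a `ValueError` instead of sending an unknown mode to the API.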

Breaking Change

Requires scrapegraph-py>=2.0.0.
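Downstream code that wants to fail early on an old install could check the version at import time. The following is a rough sketch using only the standard library; `parse_release`, `meets_minimum`, and `installed_sdk_is_v2` are hypothetical helpers, and the parser deliberately ignores pre-release suffixes rather than implementing full PEP 440 ordering.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string: str) -> tuple:
    """Turn '2.0.1' into (2, 0, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # e.g. '0rc1'; pre-release suffixes are ignored in this sketch
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(version_string: str, minimum: tuple = (2, 0, 0)) -> bool:
    """Return True if the release is at least `minimum`."""
    release = parse_release(version_string)
    # Pad short versions such as '2.0' with zeros so tuples compare fairly.
    release += (0,) * (len(minimum) - len(release))
    return release >= minimum


def installed_sdk_is_v2(package: str = "scrapegraph-py") -> bool:
    """Check the installed distribution; a missing package counts as False."""
    try:
        return meets_minimum(version(package))
    except PackageNotFoundError:
        return False


print(meets_minimum("1.44.0"))  # False
print(meets_minimum("2.0.0"))   # True
```

A project that already depends on the `packaging` library could compare `packaging.version.Version` objects instead, which handles pre-releases correctly.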

Test plan

  • Verify SmartScraperGraph with llm_model="scrapegraphai/smart-scraper" works against v2 API
  • Verify Pydantic schema is correctly forwarded via output_schema
  • Run example scripts against live API
  • Existing tests pass (no other code paths affected)
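The schema-forwarding item can be checked without a live API by mocking the client. This is a sketch under stated assumptions: `run_extract` is a hypothetical wrapper standing in for the integration code, and the mock only verifies the keyword arguments named in this PR, not any real SDK behavior.

```python
from unittest.mock import MagicMock


def run_extract(client, url, prompt, schema=None):
    """Hypothetical wrapper mirroring the call shape described in this PR."""
    return client.extract(url=url, prompt=prompt, output_schema=schema)


def test_schema_is_forwarded():
    client = MagicMock()  # stands in for the v2 client; no network access
    schema = {"type": "object", "properties": {"title": {"type": "string"}}}

    run_extract(client, "https://example.com", "Extract the title", schema)

    # Fails with AssertionError if output_schema is dropped or renamed.
    client.extract.assert_called_once_with(
        url="https://example.com",
        prompt="Extract the title",
        output_schema=schema,
    )


test_schema_is_forwarded()
```

A test like this covers the forwarding logic only; the live-API items in the plan still need a real key and network access.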

🤖 Generated with Claude Code

Update all SDK usage to match the new v2 API from ScrapeGraphAI/scrapegraph-py#82:
- smartscraper() → extract(url=, prompt=)
- searchscraper() → search(query=)
- markdownify() → scrape(url=)
- Bump dependency to scrapegraph-py>=2.0.0

BREAKING CHANGE: requires scrapegraph-py v2.0.0+

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) Mar 31, 2026
github-actions bot commented Mar 31, 2026

Dependency Review

The following issues were found:
  • ✅ 0 vulnerable package(s)
  • ✅ 0 package(s) with incompatible licenses
  • ✅ 0 package(s) with invalid SPDX license definitions
  • ⚠️ 1 package(s) with unknown licenses.
See the Details below.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA c0f5fd5.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

License Issues

pyproject.toml

| Package | Version | License | Issue Type |
|---|---|---|---|
| scrapegraph-py | >= 2.0.0 | Null | Unknown License |

OpenSSF Scorecard

| Package | Version | Score | Details |
|---|---|---|---|
| pip/scrapegraph-py | >= 2.0.0 | Unknown | Unknown |

Scanned Files

  • pyproject.toml

dosubot bot added the dependencies (Pull requests that update a dependency file) and enhancement (New feature or request) labels Mar 31, 2026
- Pass output_schema to extract() so Pydantic schemas are forwarded to the v2 API
- Use context manager pattern (with Client(...) as client) for proper resource cleanup
- Simplify examples to match the v2 SDK style from scrapegraph-py
- Remove unused sgai_logger import (v2 client handles its own logging)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>